Dataset statistics
| Number of variables | 19 |
|---|---|
| Number of observations | 6720 |
| Missing cells | 11244 |
| Missing cells (%) | 8.8% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 997.6 KiB |
| Average record size in memory | 152.0 B |
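The percentage and per-record figures in the table above follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 B and that every cell (observations × variables) counts toward the total:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 19
n_observations = 6720
missing_cells = 11244
total_size_kib = 997.6

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the report uses binary kibibytes (1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both values round to the figures reported in the table (8.8% and 152.0 B).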
Variable types
| Categorical | 8 |
|---|---|
| Numeric | 11 |
Warnings
| mifid_money_other_brokers is highly correlated with mifid_invested_other_brokers | High correlation |
|---|---|
| mifid_invested_other_brokers is highly correlated with mifid_money_other_brokers | High correlation |
| finish_mifid_days has 2900 (43.2%) missing values | Missing |
| first_deposit_days has 4733 (70.4%) missing values | Missing |
| first_trade_investor_account_demo_days has 3608 (53.7%) missing values | Missing |
| start_mifid_days has 4844 (72.1%) zeros | Zeros |
| finish_mifid_days has 801 (11.9%) zeros | Zeros |
| first_deposit_days has 89 (1.3%) zeros | Zeros |
| first_deposit_amount has 4733 (70.4%) zeros | Zeros |
| first_deposit_platform has 729 (10.8%) zeros | Zeros |
| mifid_actual_savings has 650 (9.7%) zeros | Zeros |
| mifid_next_year_savings has 650 (9.7%) zeros | Zeros |
| mifid_invested_other_brokers has 3593 (53.5%) zeros | Zeros |
| first_trade_investor_account_demo_days has 1807 (26.9%) zeros | Zeros |
| days_until_conversion_or_today has 70 (1.0%) zeros | Zeros |
Reproduction
| Analysis started | 2021-06-02 17:18:10.742672 |
|---|---|
| Analysis finished | 2021-06-02 17:19:04.676566 |
| Duration | 53.93 seconds |
| Software version | pandas-profiling v2.13.0 |
| Download configuration | config.yaml |
user_currency
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 52.6 KiB |
| USD | 3279 |
|---|---|
| EUR | 3121 |
| GBP | 318 |
| NO_CURRENCY | 2 |
Length
| Max length | 11 |
|---|---|
| Median length | 3 |
| Mean length | 3.002380952 |
| Min length | 3 |
Characters and Unicode
| Total characters | 20176 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | EUR |
|---|---|
| 2nd row | USD |
| 3rd row | GBP |
| 4th row | USD |
| 5th row | USD |
| Value | Count | Frequency (%) |
| USD | 3279 | 48.8% |
| EUR | 3121 | 46.4% |
| GBP | 318 | 4.7% |
| NO_CURRENCY | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| usd | 3279 | 48.8% |
| eur | 3121 | 46.4% |
| gbp | 318 | 4.7% |
| no_currency | 2 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| U | 6402 | 31.7% |
| S | 3279 | 16.3% |
| D | 3279 | 16.3% |
| R | 3125 | 15.5% |
| E | 3123 | 15.5% |
| G | 318 | 1.6% |
| B | 318 | 1.6% |
| P | 318 | 1.6% |
| N | 4 | < 0.1% |
| C | 4 | < 0.1% |
| Other values (3) | 6 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 20174 | > 99.9% |
| Connector Punctuation | 2 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| U | 6402 | 31.7% |
| S | 3279 | 16.3% |
| D | 3279 | 16.3% |
| R | 3125 | 15.5% |
| E | 3123 | 15.5% |
| G | 318 | 1.6% |
| B | 318 | 1.6% |
| P | 318 | 1.6% |
| N | 4 | < 0.1% |
| C | 4 | < 0.1% |
| Other values (2) | 4 | < 0.1% |
| Value | Count | Frequency (%) |
| _ | 2 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 20174 | > 99.9% |
| Common | 2 | < 0.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| U | 6402 | 31.7% |
| S | 3279 | 16.3% |
| D | 3279 | 16.3% |
| R | 3125 | 15.5% |
| E | 3123 | 15.5% |
| G | 318 | 1.6% |
| B | 318 | 1.6% |
| P | 318 | 1.6% |
| N | 4 | < 0.1% |
| C | 4 | < 0.1% |
| Other values (2) | 4 | < 0.1% |
| Value | Count | Frequency (%) |
| _ | 2 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 20176 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| U | 6402 | 31.7% |
| S | 3279 | 16.3% |
| D | 3279 | 16.3% |
| R | 3125 | 15.5% |
| E | 3123 | 15.5% |
| G | 318 | 1.6% |
| B | 318 | 1.6% |
| P | 318 | 1.6% |
| N | 4 | < 0.1% |
| C | 4 | < 0.1% |
| Other values (3) | 6 | < 0.1% |
user_country
Real number (ℝ≥0)
| Distinct | 122 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 47.54985119 |
| Minimum | 0 |
|---|---|
| Maximum | 121 |
| Zeros | 11 |
| Zeros (%) | 0.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 52.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 8 |
| Q1 | 31 |
| median | 36 |
| Q3 | 67.25 |
| 95-th percentile | 114 |
| Maximum | 121 |
| Range | 121 |
| Interquartile range (IQR) | 36.25 |
Descriptive statistics
| Standard deviation | 29.98508658 |
|---|---|
| Coefficient of variation (CV) | 0.6306031635 |
| Kurtosis | 0.1333836667 |
| Mean | 47.54985119 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 1.040233224 |
| Sum | 319535 |
| Variance | 899.1054175 |
| Monotonicity | Not monotonic |
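Several of the user_country figures above are derived from one another. A short sketch of those relationships, using the values reported in the tables (the variable names are illustrative only):

```python
# Values copied from the user_country statistics tables.
mean = 47.54985119
std = 29.98508658
q1, q3 = 31, 67.25

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
iqr = q3 - q1          # interquartile range

print(f"CV={cv:.6f}  variance={variance:.4f}  IQR={iqr}")
```

The results agree with the reported CV, variance, and IQR up to rounding.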
| Value | Count | Frequency (%) |
| 36 | 2205 | 32.8% |
| 41 | 369 | 5.5% |
| 77 | 355 | 5.3% |
| 25 | 329 | 4.9% |
| 6 | 270 | 4.0% |
| 38 | 219 | 3.3% |
| 85 | 209 | 3.1% |
| 120 | 203 | 3.0% |
| 22 | 197 | 2.9% |
| 24 | 185 | 2.8% |
| Other values (112) | 2179 | 32.4% |
| Value | Count | Frequency (%) |
| 0 | 11 | 0.2% |
| 1 | 35 | 0.5% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 121 | 63 | 0.9% |
| 120 | 203 | 3.0% |
| 119 | 2 | < 0.1% |
| 118 | 1 | < 0.1% |
| 117 | 39 | 0.6% |
start_mifid_days
Real number (ℝ≥0)
| Distinct | 354 |
|---|---|
| Distinct (%) | 5.3% |
| Missing | 3 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20.47372339 |
| Minimum | 0 |
|---|---|
| Maximum | 1090 |
| Zeros | 4844 |
| Zeros (%) | 72.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 52.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 116 |
| Maximum | 1090 |
| Range | 1090 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 82.95062164 |
|---|---|
| Coefficient of variation (CV) | 4.051565026 |
| Kurtosis | 47.63581595 |
| Mean | 20.47372339 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.278593753 |
| Sum | 137522 |
| Variance | 6880.80563 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 4844 | 72.1% |
| 1 | 339 | 5.0% |
| 2 | 140 | 2.1% |
| 3 | 99 | 1.5% |
| 4 | 63 | 0.9% |
| 5 | 62 | 0.9% |
| 7 | 57 | 0.8% |
| 6 | 45 | 0.7% |
| 8 | 38 | 0.6% |
| 11 | 31 | 0.5% |
| Other values (344) | 999 | 14.9% |
| Value | Count | Frequency (%) |
| 0 | 4844 | 72.1% |
| 1 | 339 | 5.0% |
| 2 | 140 | 2.1% |
| 3 | 99 | 1.5% |
| 4 | 63 | 0.9% |
| Value | Count | Frequency (%) |
| 1090 | 2 | < 0.1% |
| 1038 | 1 | < 0.1% |
| 967 | 1 | < 0.1% |
| 882 | 1 | < 0.1% |
| 856 | 1 | < 0.1% |
has_finished_mifid
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 52.6 KiB |
| 1 | 3818 |
|---|---|
| 0 | 2902 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6720 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 0 |
| Value | Count | Frequency (%) |
| 1 | 3818 | 56.8% |
| 0 | 2902 | 43.2% |
| Value | Count | Frequency (%) |
| 1 | 3818 | 56.8% |
| 0 | 2902 | 43.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 3818 | 56.8% |
| 0 | 2902 | 43.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6720 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 1 | 3818 | 56.8% |
| 0 | 2902 | 43.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6720 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 1 | 3818 | 56.8% |
| 0 | 2902 | 43.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6720 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| 1 | 3818 | 56.8% |
| 0 | 2902 | 43.2% |
finish_mifid_days
Real number (ℝ≥0)
| Distinct | 350 |
|---|---|
| Distinct (%) | 9.2% |
| Missing | 2900 |
| Missing (%) | 43.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 36.01439791 |
| Minimum | 0 |
|---|---|
| Maximum | 1090 |
| Zeros | 801 |
| Zeros (%) | 11.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 52.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 13 |
| 95-th percentile | 221 |
| Maximum | 1090 |
| Range | 1090 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 105.6658859 |
|---|---|
| Coefficient of variation (CV) | 2.933990072 |
| Kurtosis | 26.95379301 |
| Mean | 36.01439791 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 4.769749555 |
| Sum | 137575 |
| Variance | 11165.27945 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 835 | 12.4% |
| 0 | 801 | 11.9% |
| 2 | 399 | 5.9% |
| 3 | 234 | 3.5% |
| 4 | 135 | 2.0% |
| 5 | 101 | 1.5% |
| 6 | 76 | 1.1% |
| 7 | 65 | 1.0% |
| 8 | 62 | 0.9% |
| 10 | 52 | 0.8% |
| Other values (340) | 1060 | 15.8% |
| (Missing) | 2900 | 43.2% |
| Value | Count | Frequency (%) |
| 0 | 801 | 11.9% |
| 1 | 835 | 12.4% |
| 2 | 399 | 5.9% |
| 3 | 234 | 3.5% |
| 4 | 135 | 2.0% |
| Value | Count | Frequency (%) |
| 1090 | 1 | < 0.1% |
| 1040 | 1 | < 0.1% |
| 968 | 1 | < 0.1% |
| 936 | 1 | < 0.1% |
| 882 | 1 | < 0.1% |
has_deposit
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 52.6 KiB |
| 0 | 4733 |
|---|---|
| 1 | 1987 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6720 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
| Value | Count | Frequency (%) |
| 0 | 4733 | 70.4% |
| 1 | 1987 | 29.6% |
| Value | Count | Frequency (%) |
| 0 | 4733 | 70.4% |
| 1 | 1987 | 29.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 4733 | 70.4% |
| 1 | 1987 | 29.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6720 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 4733 | 70.4% |
| 1 | 1987 | 29.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6720 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 4733 | 70.4% |
| 1 | 1987 | 29.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6720 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 4733 | 70.4% |
| 1 | 1987 | 29.6% |
first_deposit_days
Real number (ℝ≥0)
| Distinct | 314 |
|---|---|
| Distinct (%) | 15.8% |
| Missing | 4733 |
| Missing (%) | 70.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 60.10065425 |
| Minimum | 0 |
|---|---|
| Maximum | 1050 |
| Zeros | 89 |
| Zeros (%) | 1.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 52.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 11 |
| Q3 | 47 |
| 95-th percentile | 304.7 |
| Maximum | 1050 |
| Range | 1050 |
| Interquartile range (IQR) | 43 |
Descriptive statistics
| Standard deviation | 127.7414225 |
|---|---|
| Coefficient of variation (CV) | 2.125458102 |
| Kurtosis | 17.60549262 |
| Mean | 60.10065425 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 3.825223806 |
| Sum | 119420 |
| Variance | 16317.87103 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 148 | 2.2% |
| 2 | 139 | 2.1% |
| 3 | 102 | 1.5% |
| 5 | 98 | 1.5% |
| 4 | 94 | 1.4% |
| 0 | 89 | 1.3% |
| 6 | 80 | 1.2% |
| 8 | 68 | 1.0% |
| 7 | 65 | 1.0% |
| 10 | 42 | 0.6% |
| Other values (304) | 1062 | 15.8% |
| (Missing) | 4733 | 70.4% |
| Value | Count | Frequency (%) |
| 0 | 89 | 1.3% |
| 1 | 148 | 2.2% |
| 2 | 139 | 2.1% |
| 3 | 102 | 1.5% |
| 4 | 94 | 1.4% |
| Value | Count | Frequency (%) |
| 1050 | 1 | < 0.1% |
| 1042 | 1 | < 0.1% |
| 1006 | 1 | < 0.1% |
| 984 | 1 | < 0.1% |
| 956 | 1 | < 0.1% |
first_deposit_amount
Real number (ℝ≥0)
| Distinct | 271 |
|---|---|
| Distinct (%) | 4.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.843212734 |
| Minimum | 0 |
|---|---|
| Maximum | 1000 |
| Zeros | 4733 |
| Zeros (%) | 70.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 52.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1.929161201 |
| 95-th percentile | 19.29161201 |
| Maximum | 1000 |
| Range | 1000 |
| Interquartile range (IQR) | 1.929161201 |
Descriptive statistics
| Standard deviation | 29.71117724 |
|---|---|
| Coefficient of variation (CV) | 6.134600908 |
| Kurtosis | 433.6296275 |
| Mean | 4.843212734 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 17.86520604 |
| Sum | 32546.38957 |
| Variance | 882.754053 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 4733 | 70.4% |
| 1.929161201 | 564 | 8.4% |
| 3.858322401 | 385 | 5.7% |
| 7.716644803 | 135 | 2.0% |
| 19.29161201 | 116 | 1.7% |
| 38.58322401 | 82 | 1.2% |
| 11.5749672 | 72 | 1.1% |
| 2.314993441 | 60 | 0.9% |
| 5.787483602 | 43 | 0.6% |
| 9.645806004 | 26 | 0.4% |
| Other values (261) | 504 | 7.5% |
| Value | Count | Frequency (%) |
| 0 | 4733 | 70.4% |
| 0.03858322401 | 1 | < 0.1% |
| 0.1022455436 | 1 | < 0.1% |
| 0.1736245081 | 1 | < 0.1% |
| 0.1929161201 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1000 | 1 | < 0.1% |
| 771.6644803 | 2 | < 0.1% |
| 771.5487306 | 1 | < 0.1% |
| 462.8154179 | 1 | < 0.1% |
| 385.8322401 | 7 | 0.1% |
first_deposit_platform
Real number (ℝ≥0)
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.55639881 |
| Minimum | 0 |
|---|---|
| Maximum | 6 |
| Zeros | 729 |
| Zeros (%) | 10.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 52.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 5 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.523474345 |
|---|---|
| Coefficient of variation (CV) | 0.9788457406 |
| Kurtosis | 1.278514348 |
| Mean | 1.55639881 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.639471514 |
| Sum | 10459 |
| Variance | 2.320974081 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 4733 | 70.4% |
| 5 | 854 | 12.7% |
| 0 | 729 | 10.8% |
| 3 | 266 | 4.0% |
| 6 | 69 | 1.0% |
| 4 | 53 | 0.8% |
| 2 | 16 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 729 | 10.8% |
| 1 | 4733 | 70.4% |
| 2 | 16 | 0.2% |
| 3 | 266 | 4.0% |
| 4 | 53 | 0.8% |
| Value | Count | Frequency (%) |
| 6 | 69 | 1.0% |
| 5 | 854 | 12.7% |
| 4 | 53 | 0.8% |
| 3 | 266 | 4.0% |
| 2 | 16 | 0.2% |
mifid_actual_savings
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.748958333 |
| Minimum | 0 |
|---|---|
| Maximum | 15 |
| Zeros | 650 |
| Zeros (%) | 9.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 52.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 6 |
| median | 9 |
| Q3 | 12 |
| 95-th percentile | 13 |
| Maximum | 15 |
| Range | 15 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 4.045497548 |
|---|---|
| Coefficient of variation (CV) | 0.4623976242 |
| Kurtosis | -0.4247406397 |
| Mean | 8.748958333 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.7451047123 |
| Sum | 58793 |
| Variance | 16.36605041 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 12 | 2075 | 30.9% |
| 13 | 1022 | 15.2% |
| 5 | 749 | 11.1% |
| 0 | 650 | 9.7% |
| 7 | 649 | 9.7% |
| 6 | 641 | 9.5% |
| 8 | 431 | 6.4% |
| 9 | 266 | 4.0% |
| 10 | 130 | 1.9% |
| 11 | 65 | 1.0% |
| Other values (2) | 42 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 650 | 9.7% |
| 1 | 1 | < 0.1% |
| 5 | 749 | 11.1% |
| 6 | 641 | 9.5% |
| 7 | 649 | 9.7% |
| Value | Count | Frequency (%) |
| 15 | 41 | 0.6% |
| 13 | 1022 | 15.2% |
| 12 | 2075 | 30.9% |
| 11 | 65 | 1.0% |
| 10 | 130 | 1.9% |
mifid_next_year_savings
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.356845238 |
| Minimum | 0 |
|---|---|
| Maximum | 15 |
| Zeros | 650 |
| Zeros (%) | 9.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 52.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 6 |
| median | 8 |
| Q3 | 12 |
| 95-th percentile | 13 |
| Maximum | 15 |
| Range | 15 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 4.080421732 |
|---|---|
| Coefficient of variation (CV) | 0.4882729805 |
| Kurtosis | -0.7068966342 |
| Mean | 8.356845238 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | -0.4784754684 |
| Sum | 56158 |
| Variance | 16.64984151 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 12 | 1416 | 21.1% |
| 13 | 1267 | 18.9% |
| 5 | 1004 | 14.9% |
| 6 | 821 | 12.2% |
| 7 | 717 | 10.7% |
| 0 | 650 | 9.7% |
| 8 | 406 | 6.0% |
| 9 | 216 | 3.2% |
| 10 | 109 | 1.6% |
| 11 | 62 | 0.9% |
| Other values (2) | 52 | 0.8% |
| Value | Count | Frequency (%) |
| 0 | 650 | 9.7% |
| 1 | 1 | < 0.1% |
| 5 | 1004 | 14.9% |
| 6 | 821 | 12.2% |
| 7 | 717 | 10.7% |
| Value | Count | Frequency (%) |
| 15 | 51 | 0.8% |
| 13 | 1267 | 18.9% |
| 12 | 1416 | 21.1% |
| 11 | 62 | 0.9% |
| 10 | 109 | 1.6% |
mifid_qualifications
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 52.6 KiB |
| 0 | 3560 |
|---|---|
| 1 | 3160 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6720 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 0 |
| Value | Count | Frequency (%) |
| 0 | 3560 | 53.0% |
| 1 | 3160 | 47.0% |
| Value | Count | Frequency (%) |
| 0 | 3560 | 53.0% |
| 1 | 3160 | 47.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 3560 | 53.0% |
| 1 | 3160 | 47.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6720 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 3560 | 53.0% |
| 1 | 3160 | 47.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6720 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 3560 | 53.0% |
| 1 | 3160 | 47.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6720 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 3560 | 53.0% |
| 1 | 3160 | 47.0% |
mifid_experience
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 52.6 KiB |
| 0 | 4602 |
|---|---|
| 1 | 2118 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6720 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
| Value | Count | Frequency (%) |
| 0 | 4602 | 68.5% |
| 1 | 2118 | 31.5% |
| Value | Count | Frequency (%) |
| 0 | 4602 | 68.5% |
| 1 | 2118 | 31.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 4602 | 68.5% |
| 1 | 2118 | 31.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6720 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 4602 | 68.5% |
| 1 | 2118 | 31.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6720 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 4602 | 68.5% |
| 1 | 2118 | 31.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6720 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 4602 | 68.5% |
| 1 | 2118 | 31.5% |
mifid_money_other_brokers
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 52.6 KiB |
| 0 | 3593 |
|---|---|
| 1 | 3127 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6720 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
| Value | Count | Frequency (%) |
| 0 | 3593 | 53.5% |
| 1 | 3127 | 46.5% |
| Value | Count | Frequency (%) |
| 0 | 3593 | 53.5% |
| 1 | 3127 | 46.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 3593 | 53.5% |
| 1 | 3127 | 46.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6720 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 3593 | 53.5% |
| 1 | 3127 | 46.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6720 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 3593 | 53.5% |
| 1 | 3127 | 46.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6720 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 3593 | 53.5% |
| 1 | 3127 | 46.5% |
mifid_invested_other_brokers
Real number (ℝ≥0)
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.659821429 |
| Minimum | 0 |
|---|---|
| Maximum | 15 |
| Zeros | 3593 |
| Zeros (%) | 53.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 52.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 12 |
| 95-th percentile | 13 |
| Maximum | 15 |
| Range | 15 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 5.399232522 |
|---|---|
| Coefficient of variation (CV) | 1.158677989 |
| Kurtosis | -1.521899067 |
| Mean | 4.659821429 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.49006069 |
| Sum | 31314 |
| Variance | 29.15171183 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 3593 | 53.5% |
| 12 | 1362 | 20.3% |
| 13 | 516 | 7.7% |
| 5 | 396 | 5.9% |
| 6 | 310 | 4.6% |
| 7 | 238 | 3.5% |
| 8 | 151 | 2.2% |
| 9 | 91 | 1.4% |
| 10 | 28 | 0.4% |
| 11 | 19 | 0.3% |
| Value | Count | Frequency (%) |
| 0 | 3593 | 53.5% |
| 5 | 396 | 5.9% |
| 6 | 310 | 4.6% |
| 7 | 238 | 3.5% |
| 8 | 151 | 2.2% |
| Value | Count | Frequency (%) |
| 15 | 16 | 0.2% |
| 13 | 516 | 7.7% |
| 12 | 1362 | 20.3% |
| 11 | 19 | 0.3% |
| 10 | 28 | 0.4% |
user_flow_name
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 52.6 KiB |
| 3 | 3471 |
|---|---|
| 0 | 3018 |
| 2 | 202 |
| 1 | 29 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6720 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3 |
|---|---|
| 2nd row | 3 |
| 3rd row | 3 |
| 4th row | 3 |
| 5th row | 3 |
| Value | Count | Frequency (%) |
| 3 | 3471 | 51.7% |
| 0 | 3018 | 44.9% |
| 2 | 202 | 3.0% |
| 1 | 29 | 0.4% |
| Value | Count | Frequency (%) |
| 3 | 3471 | 51.7% |
| 0 | 3018 | 44.9% |
| 2 | 202 | 3.0% |
| 1 | 29 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 3471 | 51.7% |
| 0 | 3018 | 44.9% |
| 2 | 202 | 3.0% |
| 1 | 29 | 0.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6720 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 3 | 3471 | 51.7% |
| 0 | 3018 | 44.9% |
| 2 | 202 | 3.0% |
| 1 | 29 | 0.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6720 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 3 | 3471 | 51.7% |
| 0 | 3018 | 44.9% |
| 2 | 202 | 3.0% |
| 1 | 29 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6720 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| 3 | 3471 | 51.7% |
| 0 | 3018 | 44.9% |
| 2 | 202 | 3.0% |
| 1 | 29 | 0.4% |
first_trade_investor_account_demo_days
Real number (ℝ≥0)
| Distinct | 207 |
|---|---|
| Distinct (%) | 6.7% |
| Missing | 3608 |
| Missing (%) | 53.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16.85861183 |
| Minimum | 0 |
|---|---|
| Maximum | 957 |
| Zeros | 1807 |
| Zeros (%) | 26.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 52.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 3 |
| 95-th percentile | 89.45 |
| Maximum | 957 |
| Range | 957 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 67.45858591 |
|---|---|
| Coefficient of variation (CV) | 4.001431827 |
| Kurtosis | 58.23197434 |
| Mean | 16.85861183 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.7618735 |
| Sum | 52464 |
| Variance | 4550.660813 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1807 | 26.9% |
| 1 | 313 | 4.7% |
| 2 | 202 | 3.0% |
| 3 | 103 | 1.5% |
| 4 | 61 | 0.9% |
| 5 | 60 | 0.9% |
| 7 | 36 | 0.5% |
| 6 | 35 | 0.5% |
| 8 | 22 | 0.3% |
| 10 | 20 | 0.3% |
| Other values (197) | 453 | 6.7% |
| (Missing) | 3608 | 53.7% |
| Value | Count | Frequency (%) |
| 0 | 1807 | 26.9% |
| 1 | 313 | 4.7% |
| 2 | 202 | 3.0% |
| 3 | 103 | 1.5% |
| 4 | 61 | 0.9% |
| Value | Count | Frequency (%) |
| 957 | 1 | < 0.1% |
| 898 | 1 | < 0.1% |
| 852 | 1 | < 0.1% |
| 737 | 1 | < 0.1% |
| 691 | 1 | < 0.1% |
days_until_conversion_or_today
Real number (ℝ≥0)
| Distinct | 1107 |
|---|---|
| Distinct (%) | 16.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 403.1849702 |
| Minimum | 0 |
|---|---|
| Maximum | 1128 |
| Zeros | 70 |
| Zeros (%) | 1.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 52.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 72.75 |
| median | 339 |
| Q3 | 695 |
| 95-th percentile | 1033 |
| Maximum | 1128 |
| Range | 1128 |
| Interquartile range (IQR) | 622.25 |
Descriptive statistics
| Standard deviation | 348.2360536 |
|---|---|
| Coefficient of variation (CV) | 0.8637128844 |
| Kurtosis | -1.012875967 |
| Mean | 403.1849702 |
| Median Absolute Deviation (MAD) | 294 |
| Skewness | 0.5259514888 |
| Sum | 2709403 |
| Variance | 121268.349 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 110 | 1.6% |
| 1 | 109 | 1.6% |
| 5 | 96 | 1.4% |
| 3 | 90 | 1.3% |
| 4 | 87 | 1.3% |
| 0 | 70 | 1.0% |
| 6 | 64 | 1.0% |
| 7 | 63 | 0.9% |
| 9 | 58 | 0.9% |
| 8 | 57 | 0.8% |
| Other values (1097) | 5916 | 88.0% |
| Value | Count | Frequency (%) |
| 0 | 70 | 1.0% |
| 1 | 109 | 1.6% |
| 2 | 110 | 1.6% |
| 3 | 90 | 1.3% |
| 4 | 87 | 1.3% |
| Value | Count | Frequency (%) |
| 1128 | 6 | 0.1% |
| 1127 | 2 | < 0.1% |
| 1126 | 5 | 0.1% |
| 1125 | 4 | 0.1% |
| 1124 | 7 | 0.1% |
is_converted
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 52.6 KiB |
| 0 | 5041 |
|---|---|
| 1 | 1679 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6720 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
| Value | Count | Frequency (%) |
| 0 | 5041 | 75.0% |
| 1 | 1679 | 25.0% |
| Value | Count | Frequency (%) |
| 0 | 5041 | 75.0% |
| 1 | 1679 | 25.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5041 | 75.0% |
| 1 | 1679 | 25.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6720 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 5041 | 75.0% |
| 1 | 1679 | 25.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6720 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 5041 | 75.0% |
| 1 | 1679 | 25.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6720 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 5041 | 75.0% |
| 1 | 1679 | 25.0% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
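The calculation described above can be sketched in plain Python. This is an illustrative implementation only, not the code the report itself runs; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors in the covariance and in the two standard deviations
    # cancel, so plain sums of deviations are sufficient.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical sample data: y is an exact linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
r_original = pearson_r(x, y)
# Shifting and rescaling each variable separately leaves r unchanged.
r_rescaled = pearson_r([10 * v - 3 for v in x], [0.5 * v + 7 for v in y])
print(r_original, r_rescaled)
```

The second call illustrates the invariance under changes in location and scale: both variables are transformed, yet the coefficient stays the same.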
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
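As a rough sketch of that definition, ρ can be computed by ranking both variables and applying the Pearson formula to the ranks. The helper names below are hypothetical, and giving tied values their average rank is an assumption of this example (it is the usual convention).

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

For this monotonic but nonlinear relationship the rank-based coefficient reaches its maximum, while Pearson's r stays below it.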
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
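The pair-counting definition above corresponds to the variant usually called τ-a. A minimal sketch follows, with a hypothetical function name and sample data; statistical libraries often default to the tie-adjusted τ-b, so their results can differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie, which counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6.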
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for the Phik package.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | user_currency | user_country | start_mifid_days | has_finished_mifid | finish_mifid_days | has_deposit | first_deposit_days | first_deposit_amount | first_deposit_platform | mifid_actual_savings | mifid_next_year_savings | mifid_qualifications | mifid_experience | mifid_money_other_brokers | mifid_invested_other_brokers | user_flow_name | first_trade_investor_account_demo_days | days_until_conversion_or_today | is_converted |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | EUR | 22 | 0.0 | 0 | NaN | 0 | NaN | 0.0 | 1 | 9 | 12 | 0 | 0 | 0 | 0 | 3 | NaN | 1 | 0 |
| 1 | USD | 24 | 0.0 | 0 | NaN | 0 | NaN | 0.0 | 1 | 12 | 12 | 0 | 0 | 0 | 0 | 3 | 0.0 | 1 | 0 |
| 2 | GBP | 41 | 0.0 | 1 | 1.0 | 0 | NaN | 0.0 | 1 | 12 | 13 | 0 | 1 | 1 | 12 | 3 | NaN | 1 | 0 |
| 3 | USD | 24 | 0.0 | 0 | NaN | 0 | NaN | 0.0 | 1 | 6 | 8 | 1 | 1 | 1 | 13 | 3 | NaN | 2 | 0 |
| 4 | USD | 114 | 1.0 | 0 | NaN | 0 | NaN | 0.0 | 1 | 7 | 7 | 0 | 1 | 1 | 6 | 3 | 0.0 | 2 | 0 |
| 5 | USD | 87 | 0.0 | 1 | 0.0 | 0 | NaN | 0.0 | 1 | 12 | 12 | 1 | 0 | 0 | 0 | 3 | NaN | 2 | 0 |
| 6 | USD | 60 | 0.0 | 0 | NaN | 0 | NaN | 0.0 | 1 | 5 | 5 | 1 | 0 | 1 | 6 | 3 | NaN | 2 | 0 |
| 7 | EUR | 36 | 0.0 | 1 | 3.0 | 0 | NaN | 0.0 | 1 | 7 | 13 | 1 | 0 | 1 | 7 | 3 | NaN | 3 | 0 |
| 8 | EUR | 81 | 0.0 | 0 | NaN | 0 | NaN | 0.0 | 1 | 12 | 12 | 0 | 0 | 1 | 12 | 3 | NaN | 3 | 0 |
| 9 | EUR | 36 | 0.0 | 0 | NaN | 0 | NaN | 0.0 | 1 | 8 | 8 | 0 | 0 | 0 | 0 | 3 | NaN | 3 | 0 |
Last rows
| | user_currency | user_country | start_mifid_days | has_finished_mifid | finish_mifid_days | has_deposit | first_deposit_days | first_deposit_amount | first_deposit_platform | mifid_actual_savings | mifid_next_year_savings | mifid_qualifications | mifid_experience | mifid_money_other_brokers | mifid_invested_other_brokers | user_flow_name | first_trade_investor_account_demo_days | days_until_conversion_or_today | is_converted |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6710 | EUR | 36 | 0.0 | 1 | 1.0 | 1 | 3.0 | 11.574967 | 5 | 6 | 13 | 1 | 1 | 0 | 0 | 0 | 26.0 | 7 | 1 |
| 6711 | EUR | 58 | 0.0 | 1 | 0.0 | 1 | 199.0 | 3.858322 | 5 | 13 | 13 | 1 | 0 | 0 | 0 | 0 | NaN | 20 | 1 |
| 6712 | USD | 91 | 811.0 | 1 | 829.0 | 1 | 1006.0 | 1.929161 | 0 | 13 | 5 | 1 | 0 | 1 | 12 | 0 | NaN | 1128 | 0 |
| 6713 | EUR | 58 | 0.0 | 1 | 0.0 | 1 | 2.0 | 1.929161 | 5 | 12 | 12 | 1 | 0 | 0 | 0 | 0 | 0.0 | 20 | 1 |
| 6714 | EUR | 36 | 0.0 | 0 | NaN | 0 | NaN | 0.000000 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | NaN | 1128 | 0 |
| 6715 | EUR | 36 | 246.0 | 1 | 246.0 | 1 | 248.0 | 4.629987 | 5 | 12 | 12 | 0 | 0 | 0 | 0 | 0 | 0.0 | 251 | 1 |
| 6716 | EUR | 38 | 0.0 | 0 | NaN | 0 | NaN | 0.000000 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 1128 | 0 |
| 6717 | EUR | 58 | 0.0 | 1 | 0.0 | 1 | 63.0 | 3.858322 | 5 | 5 | 5 | 1 | 1 | 0 | 0 | 0 | 0.0 | 63 | 1 |
| 6718 | USD | 20 | 167.0 | 1 | 169.0 | 1 | 196.0 | 1.929161 | 0 | 12 | 12 | 0 | 0 | 1 | 12 | 0 | 0.0 | 1128 | 0 |
| 6719 | USD | 84 | 0.0 | 0 | NaN | 0 | NaN | 0.000000 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | NaN | 1128 | 0 |